LiDAR-based reference aboveground biomass maps for tropical forests of South Asia and Central Africa

Accurate mapping and monitoring of tropical forest aboveground biomass (AGB) is crucial for designing effective carbon emission reduction strategies and improving our understanding of Earth’s carbon cycle. However, existing large-scale maps of tropical forest AGB generated through combinations of Earth Observation (EO) and forest inventory data show markedly divergent estimates, even after accounting for reported uncertainties. To address this, a network of high-quality reference data is needed to calibrate and validate mapping algorithms. This study aims to generate reference AGB datasets using field inventory plots and airborne LiDAR data for eight sites in Central Africa and five sites in South Asia, two regions largely underrepresented in global reference AGB datasets. The study provides access to these reference AGB maps, including uncertainty maps, at 100 m and 40 m spatial resolutions covering a total LiDAR footprint of 111,650 ha (ranging from 150 to 40,000 ha at site level). These maps serve as calibration/validation datasets to improve the accuracy and reliability of AGB mapping for current and upcoming EO missions (viz., GEDI, BIOMASS, and NISAR).


Background & Summary
Tropical forests play a vital role in the Earth's carbon cycle and contribute largely to uncertainties in the global carbon budget [1]. Methods to accurately map and monitor tropical forest carbon, or aboveground biomass (AGB), are thus urgently needed to improve Earth system models and to help design carbon emission mitigation strategies in the context of Reducing Emissions from Deforestation and forest Degradation (REDD+) [2,3]. In the last decade, spaceborne Earth Observation (EO) data in combination with forest inventory measurements have been extensively used to generate spatially continuous AGB maps at pan-tropical scale using different modelling strategies [3-6]. However, existing broad-scale maps show divergent estimates among themselves and differ from field-derived forest AGB stocks at different spatial scales [1,4,5,7], indicating the presence of high uncertainties in prediction maps. To improve the accuracy and reliability of AGB maps over the tropics, several ongoing and upcoming EO missions (notably NASA's GEDI, ESA's BIOMASS, NASA-ISRO's NISAR and JAXA's ALOS-4 missions) have been specifically designed to collect satellite data sensitive to forest structure, and hence to forest AGB [6,8-10]. While these new spaceborne datasets will undoubtedly revolutionise broad-scale forest AGB mapping, a network of high-quality reference data is needed to calibrate and validate the mapping algorithms [11,12]. Besides, using the same sets of reference data across different EO missions would vastly improve the comparability of, and confidence in, the derived AGB maps, enabling their use in a wide range of science, policy, and management applications [13].
Generating reference AGB observations over a given area is challenging since forest AGB is not directly measured through destructive sampling, but instead estimated from tree inventories and a series of statistical models propagating substantial uncertainties [5]. It is therefore necessary to reduce as far as possible, and to quantify, the uncertainty of reference AGB predictions. In this context, the Committee on Earth Observation Satellites (CEOS) has established a good practice protocol for generating reference AGB datasets, to facilitate the production and warrant the consistency of next-generation biomass products [14]. The protocol suggests developing reference AGB maps over sizable areas using local forest sample plots and LiDAR data acquired using aerial platforms (hereafter airborne LiDAR). Airborne LiDAR data are currently the most informative data type for characterizing forest structure and deriving AGB maps at the landscape scale, provided they are adequately calibrated with respect to local environmental gradients and forest structural and species variability [15]. Besides, high-resolution airborne LiDAR data (or derived AGB predictions) can easily be aggregated to coarser resolutions, thus bridging the scale gap between field data and the resolution of upcoming EO sensors (e.g. 100 m for NISAR, 200 m for BIOMASS).

A full list of authors and their affiliations appears at the end of the paper.
The establishment and long-term maintenance of a network of reference forest AGB observatories across the tropics entails a myriad of challenges, particularly concerning the representativeness of the network [12]. Ideally, the network should be relatively evenly distributed in space and cover the main environmental gradients. While scientific discussions on site selection are ongoing [12], the Global Ecosystem Dynamics Investigation (GEDI) sensor on board the International Space Station has already acquired data for a longer period than its initially projected lifetime. Data users would benefit from open-access reference AGB data, particularly in Asia, where large geographic regions are not represented in the calibration/validation dataset of the GEDI biomass mapping algorithm [16,17]. Besides the notion of spatial representativeness, hurdles related to the temporal mismatch between reference AGB and EO data should not be neglected. Rapid growth in regenerating forests or forest clearing/degradation (which notably characterise rural landscapes around central African cities, where slash-and-burn agriculture induces relatively fast dynamics) could rapidly make tens of thousands of GEDI data shots unusable. We argue that airborne LiDAR data acquired during the GEDI lifetime over rapidly changing landscapes are invaluable and should be utilized to improve GEDI biomass mapping algorithms, notably at the lower end of the forest biomass gradient, to best capture forest degradation and regeneration gradients.
In this context, we aim to generate reference biomass datasets over the tropics for eight sites in Central Africa and five sites in South Asia (Fig. 1) by calibrating airborne LiDAR data with locally established field plots. This paper briefly describes (1) the details of the study sites and the datasets used, (2) the methodology used to generate the reference AGB maps (Fig. 2), and (3) the Monte Carlo simulation workflow used to generate uncertainty maps along with each reference AGB map. Finally, this paper provides access to these reference AGB datasets generated at 100 m and 40 m spatial resolutions over airborne LiDAR footprints ranging from 100 to 40,000 ha.

Methods
Sampling sites and associated inventory and LiDAR datasets. We compiled co-located forest inventory and LiDAR datasets from 13 sampling sites in Central Africa and South Asia encompassing an array of abiotic conditions, forest types and structures (Fig. 1; Tables 1 and 2). Forest inventories were carried out at each site, and LiDAR datasets were obtained with an absolute temporal difference of 2.2 ± 1.9 years (range: 0-6.2 years) from the field measurements.
Forest inventories were conducted by different teams but followed similar protocols. In each plot, the diameter at breast height (DBH, referred to as D in this study, with D ≥ 10 cm) and the taxonomic identification of each tree were recorded. Tree relative coordinates within the plots were measured either at the individual level or at the 20 × 20 m quadrat level. For a subsample of trees within the plots, tree height (H) was measured using a laser rangefinder device. Finally, plot geographic coordinates were determined using points measured every 20 m along the plot borders with a combination of a differential GPS measurement system and an electronic total station (in Asia) or a regular GPS system (in Africa), to ensure an accurate link between ground and remote-sensing data. The complete inventory dataset includes D and H measurements for 97,251 and 13,303 trees, respectively, with identification rates of 89% at the species level and 92% at the genus level (8% of the trees were left unidentified). The number, size and layout of the inventory plots are uneven across sampling sites with, e.g., a single large 25-ha plot in the ForestGEO "Rabi" site, a large 30-ha plot and smaller plots of 1 ha and 0.48 ha in the "Khao Yai" site, or a varying number of scattered 1-ha plots (ranging from 2 to 16 in the "Atout" and "Achanakmar" sites, respectively). In general, the inventoried extent per site is smaller in Africa (9 ± 8 hectares) than in Asia (27 ± 13 hectares). For a breakdown of plot number, size, tree measurements and identification rates per sampling site, please refer to Table 3.
The LiDAR data at each sampling site were acquired between 2012 and 2022 using either aircraft or unmanned aerial vehicles (UAV, Table 1).
Inventory data processing: computation of reference AGB predictions. Forest inventories were first split into 1-ha (i.e. 100 × 100 m) and 0.16-ha (i.e. 40 × 40 m) plots, using information on tree location recorded in the field (i.e. either individual tree location or quadrat number). The two plot sizes correspond to the two mapping resolutions considered in this study. The 40-m resolution was chosen to account for plots where individual tree locations were only recorded at the 20 × 20 m quadrat level. In cases where the original plot size was not a multiple of the desired output size (typically when splitting 100 × 100 m plots into 40 × 40 m plots), subplots of the desired output size were selected at the edges of the original plot, thus leaving out parts of the original inventory dataset (20 m wide bands in the center, in the previous example). The resulting number of 1-ha and 0.16-ha plots compiled at each sampling site is provided in Table 3.
Subsequently, the BIOMASS R package [18] (version 2.1.8) within the R statistical platform (version 4.1.3) was used to compute reference AGB predictions for forest inventory plots at the two spatial resolutions (1-ha and 0.16-ha). To that end, we differentiated sites with a cumulative forest inventory area of 10 ha or more (i.e., 8 out of 13 sites, Tables 1 and 3) from those with less than 10 ha of cumulative forest inventory area (i.e., 5 sites). In the former case, we developed site-specific tree height-diameter (H-D) allometric models using second-order polynomials on log-transformed data (modelHD function in the BIOMASS package), and these models were used to predict the height of trees without H measurements at each respective site. In the latter case, which pertained to sites located in moist dense forests of Cameroon (SiteIDs 7 to 11 in Table 1), all inventory data from that country and biome were pooled into a single training dataset and the same H-D modelling procedure was applied. The resulting country- and biome-specific model was then used for predicting tree height at those sites. The H-D model coefficients for these site-level and Cameroon-level models are presented in Table 4. Next, a wood density (WD) estimate was attributed to each tree based on its taxonomic identification using the getWoodDensity function. Considering that tree AGB prediction is associated with various sources of uncertainty (including measurement errors of the independent variables such as tree diameter, height, and wood density, as well as prediction errors of the H-D models and the AGB allometric model) [5,14], we used a Monte Carlo approach for uncertainty propagation. Specifically, we employed the AGBmonteCarlo function of the BIOMASS package [18], which propagates the above-mentioned sources of uncertainty and outputs 1000 tree-level and subsequently plot-level AGB predictions. Tree AGB predictions were made using the pantropical AGB allometric model (i.e., Equation 4 in Chave et al. [19]). For each plot, the 1000 AGB predictions were (i) averaged to obtain a reference plot-level AGB density (hereafter AGB_REF) for the development of LiDAR-AGB models and (ii) used for the propagation of uncertainties to the final AGB maps (see section "Mapping forest AGB and prediction uncertainty").
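The tree-to-plot Monte Carlo step can be illustrated with a minimal Python sketch. It is not the BIOMASS implementation: the function names are hypothetical, the error magnitudes are illustrative assumptions, and the allometric model error (which AGBmonteCarlo also propagates) is omitted for brevity. The only element taken from the text is the pantropical model form, AGB = 0.0673 × (WD × D² × H)^0.976, with AGB in kg, D in cm, H in m and WD in g cm⁻³.

```python
import random
import statistics


def chave2014_eq4(wd, d_cm, h_m):
    """Tree AGB (kg) from the pantropical model: 0.0673 * (WD * D^2 * H)^0.976."""
    return 0.0673 * (wd * d_cm ** 2 * h_m) ** 0.976


def plot_agb_monte_carlo(trees, plot_area_ha, n_iter=1000, seed=42):
    """Return n_iter simulated plot-level AGB densities (Mg/ha).

    trees: list of (D in cm, H in m, WD in g/cm3) tuples.
    The perturbation sizes below are illustrative assumptions only.
    """
    rng = random.Random(seed)
    sims = []
    for _ in range(n_iter):
        total_kg = 0.0
        for d, h, wd in trees:
            d_i = max(d + rng.gauss(0.0, 0.02 * d), 1.0)    # diameter measurement error
            h_i = max(h + rng.gauss(0.0, 0.10 * h), 1.3)    # height (H-D model) error
            wd_i = max(wd + rng.gauss(0.0, 0.07), 0.08)     # wood density error
            total_kg += chave2014_eq4(wd_i, d_i, h_i)
        sims.append(total_kg / 1000.0 / plot_area_ha)       # kg -> Mg, per hectare
    return sims


# Hypothetical 1-ha plot made of 100 identical trees.
sims = plot_agb_monte_carlo([(30.0, 25.0, 0.6)] * 100, plot_area_ha=1.0)
agb_ref = statistics.fmean(sims)      # plot-level reference value (mean of simulations)
agb_ref_sd = statistics.stdev(sims)   # spread carried forward to the mapping step
```

Averaging the simulations gives the plot-level reference value, while the full set of simulations is what the mapping stage reuses to propagate uncertainty.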

LiDAR data processing: computation of canopy height metrics. LiDAR data from African and Asian sites were processed using LAStools (version 201124) and the lidR R package (version 4.0.1), respectively. The same processing chain was applied to generate the canopy metrics in both cases. First, a digital surface model (DSM) free of pits and spikes was generated at a 1-m resolution by interpolating the highest points on a 1-m grid. Second, a ground point classification was performed on the point cloud and a digital terrain model (DTM) was interpolated from ground points. The canopy height model (CHM) was then derived by subtracting the DTM from the DSM. Finally, the 1-m CHM was used to compute 15 canopy metrics for each plot (Table 5) as candidate predictors of forest AGB.
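As an illustration of the last two steps, the following Python sketch derives a CHM from small DSM and DTM grids and computes a few height metrics over a plot. The function names are hypothetical, rasters are represented as nested lists rather than read from files, and clamping negative heights to zero and using the population standard deviation are assumptions of this sketch rather than documented choices of the study.

```python
import math
import statistics


def canopy_height_model(dsm, dtm):
    """CHM = DSM - DTM, cell by cell; negative heights are clamped to 0 (assumption)."""
    return [[max(s - t, 0.0) for s, t in zip(s_row, t_row)]
            for s_row, t_row in zip(dsm, dtm)]


def canopy_metrics(chm):
    """Compute a subset of plot-level canopy metrics from the CHM cells of one plot."""
    values = [v for row in chm for v in row]
    n = len(values)
    metrics = {
        "meanTCH": statistics.fmean(values),                 # mean of CHM values (m)
        "sdH": statistics.pstdev(values),                    # standard deviation (m)
        "QMCH": math.sqrt(sum(v * v for v in values) / n),   # quadratic mean (m)
    }
    for threshold in (2, 5, 10):
        # Percentage of CHM cells strictly above the height threshold.
        metrics[f"CCF{threshold}"] = 100.0 * sum(v > threshold for v in values) / n
    return metrics


# Toy 2 x 2 example (elevations in metres).
dsm = [[110.0, 120.0], [105.0, 100.0]]
dtm = [[100.0, 100.0], [100.0, 100.0]]
chm = canopy_height_model(dsm, dtm)
metrics = canopy_metrics(chm)
```

In practice the same computation would be applied to all 1-m CHM cells falling within each 1-ha or 0.16-ha plot footprint.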

Specification of a general AGB model form.
While LiDAR-based AGB mapping models were trained at the site or regional level (for some Cameroon sites) to minimise local bias in model predictions [14,20], we favoured the use of (i) a single AGB model form across all sites, to facilitate inter-site comparison and the subsequent use of AGB predictions for the calibration/validation of spaceborne products, and (ii) a simple, parametric modelling approach, keeping the number of predictors to a minimum to avoid overfitting and multicollinearity issues. To specify the AGB model form, we used linear mixed-effects models to identify the LiDAR-derived canopy height metrics (LCMs) most predictive of AGB_REF variation while accounting for the hierarchical spatial structure of the data. In practice, we built 15 linear mixed-effects models (one for each LCM) on the log-transformed variables AGB_REF and LCM (Eq. 1):

$$\ln(AGB_{REF}) = a + b \times \ln(LCM) + RE_{site} + \varepsilon \quad (1)$$

where a and b are the model's coefficients, LCM represents the LiDAR-derived canopy metric, AGB_REF corresponds to the field-derived AGB prediction at a given spatial resolution (i.e. 0.16 or 1 ha), RE_site denotes the random site effect used in linear mixed-effects modelling and ε is the error term, assumed to follow a normal distribution with a mean of zero and a standard deviation σ. Based on the AIC criterion, the meanTCH metric (i.e. the mean of all CHM values in the plot area) emerged as the best predictor of AGB_REF variation at both 1-ha and 0.16-ha spatial resolutions (Table 6).
A similar procedure was run on AGB_REF prediction models combining each pair of LCMs rather than a single predictor. At both spatial resolutions, the best two-predictor model resulted in only a modest improvement in relative RMSE (i.e., <0.2%, Table 7) compared to the model based on meanTCH only. The latter model form was thus selected for biomass mapping. In line with the H-D modelling procedure, LiDAR-based AGB mapping models were either trained at the site level (for sites with a cumulative forest inventory area of 10 ha or more) or on a pooled training dataset containing all inventory data from Cameroonian moist dense forests (for sites with a cumulative forest inventory area smaller than 10 ha), henceforth referred to as the "regional" AGB model. It is noteworthy that including site as an additional fixed-effect covariate in the regional model did not yield significant effects for this variable at a 5% risk level (neither in terms of site-level intercepts nor in terms of interactions between sites and the meanTCH predictor), suggesting a minimal site effect, if any, on the regional model's predictions.
The coefficients and calibration statistics of LiDAR-based AGB mapping models are provided in Table 8, while Fig. 3 shows scatterplots of 'reference' against predicted AGB values.
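For illustration, a site-level model of the selected form (log AGB regressed on log meanTCH) can be fitted by ordinary least squares and back-transformed as in the Python sketch below. The function names are hypothetical, no random site effect is included, and applying a Baskerville correction (adding σ²/2 before exponentiation) to the AGB model is an assumption of this sketch; the text only mentions that correction for the H-D models.

```python
import math


def fit_loglog(mean_tch, agb_ref):
    """Ordinary least squares fit of ln(AGB) = a + b * ln(meanTCH); returns (a, b, sigma)."""
    lx = [math.log(v) for v in mean_tch]
    ly = [math.log(v) for v in agb_ref]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((v - (a + b * u)) ** 2 for u, v in zip(lx, ly))
    sigma = math.sqrt(rss / (n - 2))   # residual standard error, in log units
    return a, b, sigma


def predict_agb(mean_tch, a, b, sigma):
    """Back-transformed prediction with the sigma^2 / 2 bias correction (assumption)."""
    return math.exp(a + b * math.log(mean_tch) + sigma ** 2 / 2.0)


# Noise-free toy data following AGB = 2 * meanTCH^1.5, so the fit recovers a = ln 2, b = 1.5.
tch = [4.0, 9.0, 16.0, 25.0, 36.0]
agb = [2.0 * v ** 1.5 for v in tch]
a, b, sigma = fit_loglog(tch, agb)
agb_16 = predict_agb(16.0, a, b, sigma)
```

With real plot data sigma is non-zero, and the correction term matters because exponentiating an unbiased log-scale prediction otherwise underestimates the arithmetic mean.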
Mapping forest AGB and prediction uncertainty. We mapped forest AGB and prediction uncertainty over the extent of airborne LiDAR data at each site using a Monte Carlo approach similar to that used to compute plot-level AGB_REF. More specifically, we used the 1000 plot-level AGB predictions generated at the first modelling level (i.e., from tree to plot) to build 1000 LiDAR-based models per site (or at the "regional" level for Cameroonian sites with less than 10 ha of cumulative forest inventory area). At the second modelling level (i.e., from plot to landscape), pixel AGB predictions derived from LiDAR-based models carry additional uncertainty associated with the LiDAR-based models themselves. To propagate this additional uncertainty, we mimicked the procedure used in BIOMASS to propagate the uncertainty associated with the tree-level AGB allometric model (see Appendix S1 of Réjou-Méchain et al. [18] for code and details), which entailed using a Markov chain Monte Carlo algorithm to infer the uncertainty on the LiDAR-based models' parameters (i.e., model coefficients and associated RSE). The Markov chain output 1000 sets of model parameters per model. For each of the 1000 LiDAR-based models at each site, we then (1) randomly selected a set of parameters among the 1000 available sets, (2) used the model coefficients selected in (1) to predict pixel AGB and (3) added to all pixels an error term randomly drawn from a normal distribution N(0, RSE_i), where RSE_i is the model RSE selected in (1). This procedure led to 1000 predictions of pixel AGB embedding the prediction uncertainty from both the first and second modelling levels. Finally, reference AGB maps and associated spatial uncertainty maps were generated as the mean and standard deviation of the 1000 pixel AGB predictions, respectively. Hereafter, we refer to the pixel mean AGB prediction as AGB_PRED.
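A simplified Python sketch of steps (1) to (3) is given below. It makes several assumptions that go beyond the text: parameter sets are supplied as a flat list of hypothetical (a, b, RSE) tuples rather than drawn from 1000 separately fitted models and their Markov chains, the error term is drawn independently for each pixel, it is added on the log scale, and predictions are back-transformed by simple exponentiation.

```python
import math
import random
import statistics


def monte_carlo_map(mean_tch_pixels, param_sets, n_iter=1000, seed=1):
    """Return per-pixel mean and standard deviation of simulated AGB predictions.

    mean_tch_pixels: meanTCH value of each map pixel (m).
    param_sets: list of (a, b, rse) tuples for ln(AGB) = a + b * ln(meanTCH).
    """
    rng = random.Random(seed)
    sims = [[] for _ in mean_tch_pixels]
    for _ in range(n_iter):
        a, b, rse = rng.choice(param_sets)               # (1) pick one parameter set
        for k, tch in enumerate(mean_tch_pixels):
            log_agb = a + b * math.log(tch)              # (2) predict on the log scale
            log_agb += rng.gauss(0.0, rse)               # (3) add a residual error draw
            sims[k].append(math.exp(log_agb))
    mean_agb = [statistics.fmean(s) for s in sims]       # reference AGB map
    sd_agb = [statistics.stdev(s) for s in sims]         # uncertainty map
    return mean_agb, sd_agb


# Two hypothetical pixels and a single hypothetical parameter set.
mean_agb, sd_agb = monte_carlo_map([10.0, 30.0], [(1.0, 1.2, 0.2)])
cv_percent = [100.0 * s / m for s, m in zip(sd_agb, mean_agb)]
```

The last line expresses the pixel uncertainty as a coefficient of variation in percent of mean AGB, the form in which map uncertainty is summarised in the validation section.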
Additional metadata for the AGB maps. LiDAR-based AGB maps produced in the present study are intended to support calibration and validation efforts of spaceborne data. To maximise their usefulness, we provide additional information that users may require, depending on their study's objective and methodological choices, to facilitate the integration of the maps with spaceborne data and/or to develop comprehensive uncertainty propagation schemes up to the final, spaceborne-derived AGB map.
A first challenge users may face relates to the computation of the uncertainty associated with the mean AGB of arbitrary subregions of LiDAR AGB maps. Such subregions could for instance correspond to the footprints of spaceborne data unit pixels. Estimating the total mean squared error associated with a map (sub)population mean requires access to the matrix of pairwise population unit covariances, which is rarely communicated by map makers to users because of its large size. Yet, McRoberts et al. [21] recently showed that pairwise population unit covariances could largely contribute to total mean squared error, and proposed an averaging and binning approach to drastically reduce the matrix size, thus facilitating its publication along with AGB maps. While we refer interested readers to McRoberts et al. [21] for methodological details, we provide in Supplementary data all information recommended by the authors to allow map users to comply with IPCC good practice guidelines for greenhouse gas inventories. We note that for each pixel of the LiDAR-based AGB maps provided in this study, a bin number is available in the third map layer.

Another challenge lies in the propagation of uncertainties in multi-level hierarchical modelling, which is a likely use case of the LiDAR-based maps we produced. These maps were generated by applying two hierarchically nested models: a tree allometric model linking field measurements to tree AGB, and a mapping model linking plot AGB to LiDAR data. Users of the LiDAR-based AGB maps may employ a three-step hierarchical modelling approach and add as a third step a model linking high-resolution AGB predictions from the LiDAR-based maps to the coarser resolution of spaceborne data. An example of such an approach is presented in detail in Saarela et al. [22] and referred to as "three-phase hierarchical model-based inference". The uncertainty assessment in such a nested modelling approach requires information on the first two modelling steps that goes beyond the results of the Monte Carlo simulation we used to produce pixel-level uncertainty estimates. While we refer interested readers to Saarela et al. [22] for methodological details, we provide in Supplementary data all information allowing users to assess uncertainty as described in Saarela et al. [22]. This information notably includes the variance-covariance matrix of model parameters for each sampling site as well as statistics on the parameters (DBH, AGB, pixel height from the CHM, etc.) used at various levels in the chain of hierarchical models.

Table 8. Model coefficients along with standard errors (in brackets) for site-level models at 1-ha and 0.16-ha resolution. For Cameroon sites listed from 7-11 in column "Sno", a single regional model is employed. Sigma is the model residual standard error in log-transformed units. R² and RMSEs (in Mg ha⁻¹ and in %) are computed on back-transformed predictions.

Fig. 3 LiDAR-AGB models of Asian and African sites at 1-ha and 0.16-ha resolutions. The numbers for each site refer to Table 1. (7-11)* refers to the regional model established over moist dense forests of Cameroon.
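To see why pairwise covariances matter for the first challenge, recall that the variance of the mean of n pixel predictions is the sum of all entries of their covariance matrix divided by n². The Python sketch below illustrates this general identity only; it does not reproduce the averaging and binning procedure of McRoberts et al., and the function names and the constant-correlation covariance matrix are hypothetical.

```python
def variance_of_mean(cov):
    """Variance of the mean of n pixels: sum of all covariance entries / n^2."""
    n = len(cov)
    return sum(sum(row) for row in cov) / n ** 2


def constant_correlation_cov(sds, rho):
    """Build a covariance matrix from pixel SDs and one shared correlation (toy case)."""
    return [[si * sj * (1.0 if i == j else rho) for j, sj in enumerate(sds)]
            for i, si in enumerate(sds)]


# Three pixels, each with an AGB standard deviation of 10 Mg/ha.
sds = [10.0, 10.0, 10.0]
var_independent = variance_of_mean(constant_correlation_cov(sds, 0.0))  # 300 / 9
var_correlated = variance_of_mean(constant_correlation_cov(sds, 0.5))   # 600 / 9
```

In this toy case a correlation of 0.5 between pixel errors doubles the variance of the subregion mean relative to the independence assumption, which is why the binned covariance information is distributed alongside the maps.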

Data Records
For each site, AGB and uncertainty maps are distributed as a single GeoTIFF file at each of the two spatial resolutions (1- and 0.16-ha) through Dataverse [23]. Each file comprises three individual layers. The first two layers, named meanAGB and sdAGB, correspond to the mean and standard deviation of AGB predictions over the 1000 Monte Carlo simulations, respectively. The third layer, named Nbin, corresponds to the bin number each map pixel is associated with in the binning approach proposed by McRoberts et al. [21], to allow users to reconstitute a matrix of pairwise population unit covariance estimates. The file projection system is Universal Transverse Mercator.
Besides data access through Rodda et al. [23], data from Asian sites can be accessed and visualized through the Bhuvan Portal (https://bhuvan-app3.nrsc.gov.in/data/download/index.php). To access the visualisation/download through the Bhuvan Portal, select the ISRO Geosphere-Biosphere Programme under the Program option and then choose the group Above Ground Biomass (AGB) Data.
In addition, we provide two supplementary data files (in Excel format) that give additional metadata on site-level binned covariance matrices and variance-covariance matrices, and a summary of all the parameters used in the present study.

Technical Validation
Reference AGB maps at 1-ha resolution are shown in Fig. 4, and the density distributions of 1-ha AGB maps are represented in Fig. 5A along with uncertainty levels in Fig. 5B, expressed as a coefficient of variation (in % of mean AGB) (see Fig. 6 and Fig. 7 for AGB maps and respective density distributions at 0.16-ha resolution). Figure 5B shows that the mean uncertainty across sites is 15.4%, with site-level mean uncertainty ranging from 10.8 to 31%. The Nachtigal and Uppangala sites have larger mean uncertainties than other sites, with 31% and 20.1%, respectively. This can be explained by larger LiDAR-AGB model uncertainties at these sites and mapping resolution (see the models' "sigma" in Table 8).
In addition to the per-pixel estimates of uncertainty accompanying AGB maps, we hereafter provide (i) an assessment of mapping model predictive performance using a spatial model cross-validation technique [24], to provide additional insights into the reliability of AGB predictions on each map, and (ii) an assessment of mapping model extrapolation at each sampling site, which may help users filter out pixels where extrapolation occurred and retain only the highest quality AGB predictions for the calibration/validation of spaceborne products.

The calibration statistics in Table 8 likely overestimate model predictive performance on pixels that are not used for model training, that is, on most map pixels. We performed a cross-validation (CV) of each model to provide more reliable insights into model predictive performance. Field plots at each site are iteratively split into training and test data, and model CV statistics are built on the set of test data predictions. Regarding CV design, we selected a buffered leave-one-out cross-validation (LOO-CV [25]), in which a spatial buffer around the test data is used to exclude observations located in the neighbourhood of the test data from the model training dataset, thus avoiding inflation in CV statistics due to spatial autocorrelation in forest AGB [24]. As a compromise between the diversity in number and spatial arrangement of field data across sites (e.g. multiple individual 1-ha plots vs. a single large plot), the consistency of the CV approach across sites, and our expectation of a relatively weak spatial autocorrelation in forest AGB at the high resolution of the maps (<100 m²) [26], we selected a LOO-CV with a 100 m buffer radius for all sites and mapping resolutions (i.e. 100 × 100 m and 40 × 40 m). This CV design notably implies that (i) when a test observation came from a large field plot (i.e. >1 ha, e.g. the 25-ha plot at Rabi), subplots in its direct neighbourhood were not used for model training (i.e., all subplots intersecting a 100 m circular buffer around the center of a test subplot were excluded from the training set, regardless of the mapping resolution), and (ii) at the 40 × 40 m mapping resolution, when a test observation came from a 1-ha field plot, the remaining three subplots of that 1-ha plot were not used for model training. The results of the buffered LOO-CV are presented in Table 9. They show that the predictive performances of mapping models developed in this study are comparable to those found in the literature (i.e. 15-20% on average for the tropical forest biome [15]), with relative RMSEs ranging from 10.6 to 20.1% (mean across sites: 14.1%) at 1-ha and from 17.7 to 33.7% (mean across sites: 25.7%) at 0.16-ha.

Fig. 5 Density distributions of (A) mean pixel AGB and (B) AGB uncertainty, expressed as a coefficient of variation (CV, in %), at 1-ha mapping resolution across sites.
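The fold construction of a buffered leave-one-out cross-validation can be sketched in Python as follows. The function names are hypothetical, and the exclusion rule is simplified: plots are dropped from training when their centre lies within the buffer radius of the test plot centre, whereas the study excluded all subplots intersecting the circular buffer.

```python
import math


def buffered_loo_folds(coords, buffer_m=100.0):
    """Yield (test_index, train_indices) pairs for a buffered leave-one-out CV.

    coords: list of (x, y) plot-centre coordinates in metres (projected CRS).
    Plots whose centre lies within buffer_m of the test plot are left out of training.
    """
    for i, (xi, yi) in enumerate(coords):
        train = [j for j, (xj, yj) in enumerate(coords)
                 if j != i and math.hypot(xj - xi, yj - yi) > buffer_m]
        yield i, train


def relative_rmse(observed, predicted):
    """RMSE of test predictions, in % of the mean observed value."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmse / (sum(observed) / n)


# Two adjacent 40-m subplots plus two distant plots (hypothetical coordinates).
coords = [(0.0, 0.0), (40.0, 0.0), (500.0, 0.0), (1000.0, 0.0)]
folds = list(buffered_loo_folds(coords))
```

For each fold, the mapping model would be refitted on the training indices and used to predict the held-out plot; the relative RMSE is then computed over all held-out predictions.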

Model extrapolation in the predictor space. Uncertainty maps, AGB maps, and model CV results provide insights into the reliability of AGB predictions within the calibration domain of mapping models. It is however likely that the entire gradient of forest structure sampled by LiDAR data was not fully sampled in the model's training set, thus leading to situations of predictive extrapolation where prediction uncertainty is unknown. To investigate this issue, we compared the range of vegetation height (i.e., meanTCH) sampled by the training set of each mapping model to the full range found in the LiDAR data, restricting the analysis to pixels considered as vegetated, i.e., with meanTCH ≥ 2 m. We found that the proportion of pixels affected by predictive extrapolation varied strongly across sites and between the two mapping resolutions. Generally, the upper range of meanTCH (and thus of AGB_PRED) found at the landscape scale in the LiDAR data was sampled in the training set (Fig. 8A,B), which probably reflects the "majestic forest bias" [27], that is, the tendency for researchers to preferentially establish sample plots where forest stands appear the least disturbed (e.g. tallest canopy height, highest abundance of large trees, etc.). However, a varying and often substantial proportion of map pixels at the lower end of the meanTCH gradient was outside the model's calibration domain. For instance, at the Nachtigal site predictive extrapolation occurred on about 83% of the vegetated pixels on the 1-ha AGB map. This can be explained by the nature of this site, a forest-savanna mosaic, where the meanTCH of all herbaceous and shrubby savannas is lower than the height of the smallest 1-ha forest stand (i.e., 16.4 m) found in the model training set (Fig. 8A). However, this proportion dropped to 0% at the 0.16-ha mapping resolution thanks to the inclusion in the model training set of 18 additional 0.16-ha plots established in savanna-dominated areas (Fig. 8B, Table 3).
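The comparison described above amounts to classifying each vegetated pixel as below, inside or above the range of meanTCH covered by the training plots. A minimal Python sketch follows; the function name, the dictionary keys and the example values are hypothetical, with only the 2 m vegetation threshold and the 16.4 m lower bound borrowed from the text.

```python
def extrapolation_shares(mean_tch_pixels, training_tch, min_veg_height=2.0):
    """Return the % of vegetated pixels below, inside and above the calibration domain.

    mean_tch_pixels: meanTCH of each map pixel (m).
    training_tch: meanTCH of the plots used to train the mapping model (m).
    """
    lo, hi = min(training_tch), max(training_tch)
    counts = {"below": 0, "inside": 0, "above": 0}
    for v in mean_tch_pixels:
        if v < min_veg_height:
            continue                      # non-vegetated pixel, ignored
        if v < lo:
            counts["below"] += 1
        elif v > hi:
            counts["above"] += 1
        else:
            counts["inside"] += 1
    n = sum(counts.values())
    if n == 0:
        return counts
    return {key: 100.0 * c / n for key, c in counts.items()}


# Hypothetical pixels against a training range of 16.4 to 40 m.
shares = extrapolation_shares([1.0, 5.0, 10.0, 20.0, 30.0, 45.0], [16.4, 25.0, 40.0])
```

Users wishing to retain only interpolated predictions could apply the same range test as a pixel mask before using the maps as reference data.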
We thus advise potential users of the AGB maps published here to carefully consider the bounds of the mapping model calibration domain at each site (for which we provide corresponding AGB_PRED values in Fig. 8) when using these maps as reference data for larger-scale product calibration/validation.

Table 9. Error statistics of the modified LOO-CV procedure at site level for 1-ha and 0.16-ha plots.

Usage Notes
Forest AGB maps released here constitute reference estimates for the community of remote-sensing scientists interested in forest carbon stocks. For instance, we expect these maps to be highly useful for the calibration and validation of next-generation broad-scale aboveground biomass mapping models based on data from ongoing or upcoming spaceborne missions (viz., NASA's GEDI, NASA-ISRO's NISAR and ESA's BIOMASS missions). These data can also be useful when assessing the accuracy of existing maps or recalibrating them (as in e.g. [28,29]), especially since the study sites presented here are located in notoriously data-poor regions [16] marked by notable uncertainties in AGB estimates [17]. That said, we encourage users to account for the dates of LiDAR and ground data acquisitions underlying reference AGB estimates (cf. Table 1), as temporal discrepancies with spaceborne signals or products should ideally be accounted for in any calibration or validation exercise. More broadly, our study highlights sites of potential interest to build a network of "super-sites" (sensu [11]) across the tropics, that is, sites combining forest inventory data over sizable areas (≥10 ha), ideally featuring multiple forest censuses, with airborne LiDAR data. Such data have been collected thanks to the long-term vision of a few organisations, to dedicated experts and to the efforts of trained labour forces over the past decades. Ongoing global changes make the sustained monitoring of permanent forest plots in long-term study sites critical, so that their impacts on forest ecosystems can be measured. Despite these high stakes, access to funding in the tropical world for renewing expertise, running training programs and supporting field data acquisition campaigns is critically limited. We thus urge national and international research and space agencies to ensure long-term funding for on-the-ground forest research in the tropics.

Fig. 1 (A) Overview map showing the locations of sampling sites (n = 13) used in the current study. Outlined regions are expanded in (B): South Asian region and in (C): Central African region. Sampling site names and descriptions associated with site numbers are provided in Table 1.

Fig. 2 Flow chart depicting workflow of the data analysis procedure to generate reference AGB datasets.
meanTCH: Mean of CHM values (in m)
sdH: Standard deviation of CHM values (in m)
CV: Coefficient of variation of CHM values (meanTCH divided by sdH)
QMCH: Quadratic mean of CHM values
CCF2, CCF5, CCF10: Percentage of CHM values above 2, 5 and 10 m (in %)
rumple: Roughness of CHM surface (rumple_index function in lidR R package)

Fig. 4 Reference AGB maps of Asian and African sites at 1-ha spatial resolution.

Fig. 6 Reference AGB maps of Asian and African sites at 0.16-ha spatial resolution.

Fig. 7 Density distributions of (A) mean pixel AGB and (B) AGB uncertainty, expressed as a coefficient of variation (CV, in %), at 0.16-ha mapping resolution across sites.

Fig. 8 Proportion (in %) of map pixels outside and inside model calibration domains at 1-ha (panel A) and 0.16-ha (panel B) mapping resolutions. The proportions are computed with respect to the total number of map pixels with CHM > 2 m, with the exception of the Nachtigal site, where a 0.4 m threshold is used to account for the nature of the site, i.e., a forest-savanna mosaic. The proportion of map pixels within model calibration domains is represented in red. Map pixels below and above the range of model calibration domains are represented in blue and green, respectively.

Table 5 )
as candidate predictors of forest AGB.

Table 1. Sampling site details on forest types, inventory statistics and characteristics of the LiDAR acquisitions. Area INV indicates the total area of field inventories, LiDAR Date indicates the month and year of acquisition of LiDAR data and LiDAR Area indicates the total area covered by LiDAR data over the site. N range and BA range indicate the range in number of trees and basal area per hectare across the inventoried area, respectively. The associated plant functional types (PFTs) for each site are derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) Land Cover Type product (MCD12Q1), which follows the Land Cover Type 5 Classification Scheme; a similar strategy is adopted by the GEDI mission.

Table 2. Environmental conditions across sampling sites. Over the LiDAR acquisition area, the statistics (mean ± standard deviation) of elevation and slope are computed using SRTM at 30-m spatial resolution (V3 product); Mean Annual Temperature (MAT) and Mean Annual Precipitation (MAP) are computed using WorldClim Version 2.1 data.

Table 3. Site-level details on field plot layout, the number of compiled plots at 1-ha (N_1ha) and 0.16-ha (N_0.16ha), the total number of trees across all plots (N_Trees) and the number of trees measured for height (N_Tree_hts). Species, Genus and Family (%) stand for the identification rate (in %) at the given taxonomic level.

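Table 4. H-D model coefficients (a, b, c) of the second-order log-log polynomial model form (ln(H) = a + b × ln(D) + c × ln(D)² + ε), where H is the height of the tree and D is the tree diameter. ε is the normally distributed error to be used during back-transformation for Baskerville correction.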

Table 5. List of canopy metrics derived from LiDAR-derived CHMs over the extent of forest plots.

Table 6. LiDAR-AGB linear mixed-effects model performance statistics at 1-ha and 0.16-ha plot sizes. The table is sorted in ascending order based on the column "AIC" (Akaike information criterion) when the respective LiDAR Canopy Metric (LCM) is used in Eq. 1. R² and RMSEs (in Mg ha⁻¹ and in %) are computed on back-transformed predictions.

Table 7. LiDAR-AGB linear mixed-effects model performance statistics at 1-ha and 0.16-ha plot sizes using two LCMs as predictive variables. The table is sorted in ascending order based on the column "AIC" (Akaike information criterion) when the respective LiDAR Canopy Metrics (LCMs) are used in Eq. 1. R² and RMSEs (in Mg ha⁻¹ and in %) are computed on back-transformed predictions.